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Abstract 

Background: Salamanders are unique among vertebrates in their ability to completely regenerate amputated limbs 
through the mediation of blastema cells located at the stump ends. This regeneration is nerve-dependent because 
blastema formation and regeneration does not occur after limb denervation. To obtain the genomic information of 
blastema tissues, de novo transcriptomes from both blastema tissues and denervated stump ends of Am by stoma 
mexicanum (axolotls) 14 days post-amputation were sequenced and compared using Solexa DNA sequencing. 

Results: The sequencing done for this study produced 40,688,892 reads that were assembled into 307,345 
transcribed sequences. The N50 of transcribed sequence length was 562 bases. A similarity search with known 
proteins identified 39,200 different genes to be expressed during limb regeneration with a cut-off E-value 
exceeding 10" 5 . We annotated assembled sequences by using gene descriptions, gene ontology, and clusters of 
orthologous group terms. Targeted searches using these annotations showed that the majority of the genes were 
in the categories of essential metabolic pathways, transcription factors and conserved signaling pathways, and 
novel candidate genes for regenerative processes. We discovered and confirmed numerous sequences of the 
candidate genes by using quantitative polymerase chain reaction and in situ hybridization. 

Conclusion: The results of this study demonstrate that de novo transcriptome sequencing allows gene expression 
analysis in a species lacking genome information and provides the most comprehensive mRNA sequence resources 
for axolotls. The characterization of the axolotl transcriptome can help elucidate the molecular mechanisms 
underlying blastema formation during limb regeneration. 
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Background 

Ambystoma mexicanum (axolotl), one of over 500 spe- 
cies of salamander, can completely reconstitute lost 
limbs after amputation. The amputation of limbs results 
in the formation of blastemas in the stump ends. These 
blastemas contain undifferentiated cells capable of grow- 
ing and developing into new limbs exactly as they were 
before amputation [1]. In the early phase of regener- 
ation, growing wound epithelium and epidermis cover 
the ends of the truncated nerves and the surface of the 
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amputation site within several hours [2-4]. After the 
nerves and wound epidermis contact each other, the epi- 
dermis overlying the axon ends thickens, forming an ap- 
ical epithelial cap [5]. Fibroblasts from the surrounding 
tissue simultaneously migrate to the amputation site 
under the apical epithelial cap. These fibroblasts prolifer- 
ate to form a mass of undifferentiated cells that subse- 
quently develops into the new limb. In the absence of 
functional nerves, an apical epithelial cap and blastema 
cannot be formed on the amputation stump [6]. Instead, 
denervated limbs undergo a wound-healing response 
post- amputation, and do not regenerate [7,8]. 

In past several years, next-generation sequencing (NGS) 
technology has become a cutting-edge approach for high- 
throughput sequence determination. This technology has 
dramatically improved the efficiency and speed of gene 
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discovery in many studies [9,10], and has significantly 
accelerated and improved the sensitivity of gene expres- 
sion profiling. For example, studies in the field of human 
[11] and Arabidopsis [12] transcriptomics have made re- 
markable progress. Studies using transcriptome sequen- 
cing for organisms with complete genome sequencing 
have confirmed that the short-read products of NGS can 
be effectively assembled and used for gene discovery and 
comparison of gene expression profiles. 

National Center for Biotechnology Information (NCBI) 
[13-15] and Sal-Site [16] have already made available 
many cDNA sequene databases for axolotl. However, 
there still exist many unannotated protein-coding genes 
or non-coding RNA sequences that have not been sam- 
pled previously and are not present in existing cDNA li- 
braries. Recent reports have also suggested that more 
Ambystoma to human non-redundant (nr) orthologous 
sequences remain to be discovered [15]. Moreover, se- 
quence coverage of transcripts is highly variable between 
different cDNA libraries. With more available cDNA se- 
quences, the overall sequence coverage of axolotl tran- 
scripts will be improved. Although previous studies have 
highlighted the usefulness of cDNA sequencing for the 
discovery of candidate genes in the absence of a genome 
sequence database, a comprehensive description of the 
full spectrum of genes expressed in axolotl blastemas is 
still lacking. To our knowledge, the genome sequencing 
of any salamander species has not been completed. 

Several studies have used highly parallel 454 pyro- 
sequencing to identify axolotl sequences which are used 
to generate a large-scale feature axolotl microarray 
[14,15,17,18]. However, 454 pyrosequencing has relatively 
lower overall transcriptome coverage when compared to 
Illumina/Solexa platforms [19-21]. Several recent studies 
have employed the Illumina/Solexa platform to offer a far 
greater coverage than 454 pyrosequencing [19-21]. How- 
ever, in the early stages of this platform, the majority of 
Illumina sequence reads could not be matched to known 
genes because of their short length. In general, 454 
pyrosequencing had longer sequence reads whereas 
Illumina sequencing had shorter, but more numerous 
paired ends read [19-21]. Currently, the latest develop- 
ments in 454 and Illumina technologies offer higher reso- 
lution and are relatively consistent with each other. With 
improved quality and longer reads, the higher coverage 
from Illumina technologies allows for the identification 
of low- abundance genes not detected in earlier studies of 
limb regeneration based on 454 pyrosequencing. There- 
fore, Illumina platforms are well suited for gene discovery 
and promising insights into axolotl limb regeneration. 

The de novo transcriptome sequencing of axolotl 
blastema in this study produced over 4 billion bases of 
high-quality cDNA sequences, which were assembled and 
annotated without a reference genome. 116,787 distinct 



sequences, including hundreds of developmental genes and 
wound-healing genes were identified. The gene expression 
profiles of a regenerating blastema and a non-regenerating 
denervated limb stump, 14 days post-amputation, were 
compared using differential gene expression analysis. A 
list of genes significantly overexpressed in normal regen- 
erating blastema was obtained from the results of the 
analysis. A subset of genes from this list was verified 
using quantitative polymerase chain reaction (qPCR) and 
in situ hybridization. 

Results and discussion 

Illumina NGS and read assembly 

To obtain the gene expression profiles of an axolotl blas- 
tema and a non-regenerating denervated limb stump, 
mRNA samples of both types of tissue, 14 days post- 
amputation, were prepared and sequenced using an 
Illumina sequencing platform. We performed a single se- 
quencing run for each of the 2 tissues. After cleaning and 
quality assurance, we obtained 40 million 100-bp reads 
from both samples (Table 1). These raw reads were ran- 
domly clipped into 21-mers for sequence assembly using 
SOAP de-novo [22] to yield 2,920,951 contigs (Table 1) 
with a mean contig size of 117 bp. The size distribu- 
tion of these contigs is shown in Additional file 1. In 
a previous report [14], Monaghan et al. generated over 
1.7 million reads and approximately 400,000 unique se- 
quences with a mean contig size of 215 bp for axolotl 
from a broader range of regeneration stages using 454 
pyrosequencing. Using paired-end joining and gap-filling, 
we further assembled these contigs into 307,345 tran- 
scribed sequences with a mean size of 373 bp including 
20,504 transcribed sequences larger than 1000 bp (Table 1). 
After clustering using TIGR Gene Indices clustering tools 
(TGICL) [23], the 307,345 transcribed sequences generated 
116,787 unigenes with a mean size of 529 bp (Table 1). In 
this report, the term "unigene" indicates a transcribed 



Table 1 Summary for Ambystoma mexicanum 
transcriptome 



Total number of reads 


40,688,892 


Total base pairs (bp) 


4,862,000,340 


Average read length 


100 bp 


Total number of contigs 


2,920,951 


Mean length of contigs 


117 bp 


Total number of transcribed sequences 


307,345 


Mean length of transcribed sequences 


373 bp 


Transcribed sequence N50 


562 bp 


Total unigenes 


116,787 


Mean length of unigenes 


529 


Unigenes with E-value < 10" 5 


39,200 
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sequence that matches no other transcribed sequence and 
has distinct sequences. 

The data sets are available at the NCBI Short Read 
Archive (SRA) with the accession number: SRA064951. 

Annotation of predicted proteins 

For annotation, BLASTx was used to compare unigenes 
against the NCBI nr database by using a cut-off 
E-value of 10~ 5 . Using this approach, 39,200 genes (33.6% 
of unigenes) were found to be above the E-value cut-off 
(Additional file 2). Because of the relatively short length 
of unigene sequences (mean size of 529 bp) and lack of 
genome reference in axolotl, most of the 77,587 (66.4%) 
assembled sequences could not be matched to any 
known genes. Figure 1A shows that the proportion of 
assembled sequences with matches in the nr database is 
higher among the longer sequences. Specifically, se- 
quences longer than 2000 bp had a match efficiency of 
80.6%, whereas the match efficiency decreased to ap- 
proximately 43.8% for sequences ranging from 500 to 
1000 bp in length, and to 23.9% for sequences between 
200 and 500 bp in length (Figure 1A). The E-value distri- 
bution of the top matches in the nr database showed that 
40% of the mapped sequences have strong homology 
(smaller than 1.0E-50), whereas 60% of the homologous 
sequences ranged from 1.0E-5 to 1.0E-50 (Figure IB). 



The sequence similarity distribution has a comparable 
pattern, with 24.8% of the sequences having a similarity 
higher than 80%, and 75.2% of the sequences having a 
similarity ranging from 14% to 80% (Figure 1C). 

Gene ontology and classification of clusters of 
orthologous groups 

Gene ontology (GO) assignments were used to classify 
the functions of the predicted axolotl genes. Based on 
sequence homology, 15,633 sequences can be catego- 
rized into 48 functional groups (Figure 2). In each of 
the 3 main categories of GO classification (i.e., biological 
process, cellular components, and molecular function), 
"cellular process", "cell part", and "binding" terms are 
dominant, respectively. However, we found a few genes 
in the categories: "virion", "virion part", "metallochaperone 
activity", and "electron carrier activity". We also noticed a 
high percentage of genes in the "developmental process" 
and "response to stimulus" categories, but only a few genes 
for "cell killing" and "antioxidant activity". To further 
evaluate the completeness of this transcriptome library and 
the effectiveness of the proposed annotation process, 
we searched the annotated sequences for the genes 
involved in clusters of orthologous groups (COG) clas- 
sifications. Of the 39,200 matches to the nr database, 
9744 sequences have a COG classification (Figure 3). 




■ 14%-40% ■40%-60% 60%-80% 

■ 80%-100%"100% 

Figure 1 Characteristics of homology search of lllumina sequences against the nr database. (A) Effect of query sequence length on the 

percentage of sequences for which significant matches were found. The proportion of sequences with matches (with a cut-off E-value of 1.0E-5) 

in the NCBI nr databases is greater among the longer assembled sequences. (B) E-value distribution of BLAST hits against the nr database for 

each unique sequence with a cut-off E-value of 1.0E-5. (C) Similarity distribution of the top BLAST hits for each sequence. 
\ J 
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Figure 2 Histogram presentation of Gene Ontology classification. Results are summarized in three main categories: biological process, 
cellular component, and molecular function. The right y-axis indicates the number of genes in a category. The left y-axis indicates the percentage 
of a specific category of genes in the main category. 



Among the 25 COG categories, the cluster for "general 
function prediction" represents the largest group (3785, 
20.7%), followed by "replication, recombination, and re- 
pair" (1901, 10.4%) and "transcription" (1436, 7.9%). 
The following categories represent the smallest groups: 



"extracellular structures" (9, 0.049%), "nuclear structure" 
(7, 0.049%), and "RNA processing and modification" (79, 
0.43%) (Figure 3). To identify the biological pathways ac- 
tive in the axolotl, we mapped unigenes to canonical ref- 
erence pathways using the Kyoto Encyclopedia of Genes 
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Figure 3 Histogram presentation of clusters of orthologous groups (COG) classification. Out of 39,228 nr hits, 9744 sequences have a COG 
classification among the 25 categories. 
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and Genomes (KEGG) [24]. In total, 24,095 sequences 
were assigned to 217 KEGG pathways. The pathways 
with the most representation by the unique sequences 
were starch and sucrose metabolism (2801 members, 
11.62%), pathways in cancer (1126 members, 4.67%), and 
regulation of actin cytoskeleton (1029 members, 4.27%). 
These annotations provide a valuable resource for inves- 
tigating specific processes, functions, and pathways in 
axolotl limb regeneration research. 

Detection of growth factor and transcription factor 
sequences related to limb regeneration in axolotls 

Fibroblast growth factors (FGFs) participate in salamander 
limb regeneration [25-29]. For example, fgf-8 is expressed 
in the basal layer of the apical epithelial cap [30]. Develop- 
mental genes are also re-expressed in blastema mesenchy- 
mal cells [31-36]. For example, homeobox A13, an autopod 
marker in many vertebrates, appears in the distal region of 
the blastema [32,37,38]. Therefore, sequences related to 
growth factors and transcription factors involved in regen- 
eration were analyzed and compared to sequences from 
NCBI nucleotides and the EST database. As Table 2 shows, 
this process identified a number of sequences that are 
homologous to growth factors related to limb regeneration, 
such as transforming growth factor (tgf)-/3s and fgfs, as well 
as transcription factors involved in limb regeneration, such 
as Homeobox (hox) genes. In total, this study identified 6 
tgf-f) sequences. After removing redundant sequences, 3 
different tgffi isoforms, 17 different hox genes, and 7 dif- 
ferent fgfs were identified (Table 3). Four tgf-fis (tgf-fil, 2, 3, 
and 5) have been identified previously in axolotl [39]. This 
study identifies 3 {tgf-fil, 2, 3) and represents a nearly 
complete collection of such genes in axolotl. These find- 
ings confirm the high quality and high yield of the sequen- 
cing data in this study, and may facilitate axolotl limb 
regeneration research. 

Comparing gene expression profiles between blastema 
and denervated limb stump 

To identify specific genes participating in limb regener- 
ation, the differentially expressed genes between blastema 
and non-regenerating denervated limb stump were ana- 
lyzed using 3 different de novo assemblers to increase 



accuracy: Trinity [40], trans- ABySS [41], and SOAP de- 
novo [42]. BLASTx was utilized to compare the assembled 
contigs from the 3 respective assemblers with the protein 
database of Xenopus laevis. The contigs with the best 
e-values were assigned to represent the potential axolotl 
genes. The fold change of the reads per kilo base of isotig 
length per million mapped reads (RPKM) values between 
the libraries from the two tissue types was calculated to 
identify differentially expressed genes. The numbers of dif- 
ferentially expressed genes for each assembler are: Trinity 
2651, trans-ABySS 2693, and SOAP de-novo 2833. Because 
various assemblers deliver different sets of differentially 
expressed genes, only genes homologous to a Xenopus 
laevis protein with corresponding contigs from all the 3 as- 
semblers were selected to obtain more accurate results. 
With this strategy, we detected 1953 genes with differential 
expression levels between blastema and denervated limb 
stump libraries. Of these differentially expressed genes, 885 
were up-regulated and 1068 were down-regulated in blas- 
tema (Additional file 3). Among the up-regulated genes, 
anterior gradient rescues a denervated limb stump to be- 
come regenerating again [43]; matrix metalloproteinase 13 
(mmpl3) is associated with metamorphic developmental 
processes in axolotl epidermis [44]; sox9 is essential for 
sclerotome development and cartilage formation [45]; and 
patchedl mediates hedgehog signaling into blastema [46]. 
The set of differentially expressed genes from this study 
was compared with previous sequencing projects using 
454 pyrosequencing and microarray [14] (Additional 
file 4). Only the data sets in the previous studies on tissues 
14 days after amputation, similar to this study, were chosen 
for comparison. The unigenes that could not be annotated 
were excluded here. The Venn diagrams in Additional 
file 4 reveal overlaps between these 3 data sets. Among 
the overlapped genes, genes known to be necessary for 
limb development and limb regeneration were identified 
including wntSa which is involved in Wnt/planar cell 
polarity signaling, and msx2 which plays important roles 
in the development of limb buds. 

Functional annotation of differentially expressed genes 

To understand the functions of differentially expressed 
genes, the set of 1953 differentially expressed genes 



Table 2 Comparison of the sequence numbers of transforming growth factor-j3s, fibroblast growth factors, and hox 
genes identified in this study and in an EST database 

Gene name °Number of sequences had ^Number of known sequences c Number of known sequences in 

a hit with nr database from NCBI nucleotide database an EST sequencing project [13] 

Transforming growth factor-^s 6 1 1 

Fibroblast growth factors 8 5 2 

Hox genes 28 7 1 

a Number of sequences obtained in this study that have a hit with the corresponding proteins in the NCBI nr database. 

b Number of sequences from NCBI nr database hit in a that show homology with corresponding proteins from NCBI nucleotide database (to July 2012). 
c Number of sequences that can be identified in an EST database established by an ambystoma mexicanum EST sequencing project [13]. 



Table 3 Identified transforming growth factor-fis, fibroblast growth factors, and Hox genes 



Gene name 


Gene ID 


Length 


Subject ID 


Species 


E value 


Transforming growth factor-beta-1 


Unigene8146_AII 


3212 


gi 


1 60,334,206|gb|ABX24523.1 1 


Ambystoma mexicanum 


0 


Transforming growth factor-beta-2 


Unigene17537_AII 


2048 


gi 


155,966,093|gb|ABU41003.1| 


Anser anser 


0 


Transforming growth factor-beta-3 


Unigene9879_AII 


866 


gi 


257,173|gb|AAB23575.1| 


Gallus gallus 


5.00E-19 


Fibroblast growth factor-2 


Unigenel 14478_AII 


512 


gi 


15,21 6,301 |dbj | BAB63249.1 


Cynops pyrrhogaster 


9.00E-80 


Fibroblast growth factor-8 


Unigenel 16687_AII 


1666 


gi 


1 2,024,873|gb|AAG45674.1 |AF1 90448J | 


Ambystoma mexicanum 


1.00E-117 


Fibroblast growth factor-9 


Unigene82354_AII 


1419 


gi 


61,971,458|gb|AAX581 14.1 1 


Didelphis albiventris 


1.00E-113 


Fibroblast growth factor- 10 


Unigene32567_AII 


2235 


gi 


1 9,031,1 93|gb|AAK59700.1| 


Ambystoma mexicanum 


1.00E-113 


Fibroblast growth factor- 12 


Unigene43917_AII 


396 


gi 


238,81 5,007|gb|ACR 56700.1 1 


Ovis aries 


5.00E-44 


Fibroblast growth factor- 16 


Unigenel 13330_AII 


434 


gi 


224,098,374|ref|XP_0021 95984.1 1 


Taeniopygia guttata 


7.00E-36 


Fibroblast growth factor- 1 7-1 ike 


Unigene88737_AII 


636 


gi 


301, 75 7,980 [ ref |X P_0029 1 4827.1 1 


Ailuropoda melanoleuca 


4.00E-43 


HoxA3 


Unigenel 81 17_AII 


1271 


gi 


220,898,1 79|gb|ACL 81 435.1 1 


Latimeria menadoensis 


1.00E-115 


HoxB3 


Unigene92817_AII 


203 


gi 


37,528,843|gb|AAQ92347.1| 


Pleurodeles waltl 


4.00E-16 


HoxD3 


Unigene38161_AII 


710 


gi 


1 26,341 ,823|ref|XP_001 362908.1 1 


Monodelphis domestica 


7.00E-72 


HoxD4 


Unigene74557_AII 


206 


gi 


301,1 28,902|emb|CBL5 9364.1 1 


Scyliorhinus canicula 


3.00E-06 


HoxA7 


Unigenel 5254_AII 


478 


gi 


6,754,234|ref|NP_034585.1| 


Mus musculus 


1 .00E-27 


HoxB8 


Unigenel 071 83_AII 


283 


gi 


256,01 4,548|gb|ACU56828.1 1 


Ichthyophis cf. kohtaoensis JMW-2009 


2.00E-18 


HoxD8 


Unigenel 050_AII 


1896 


gi 


7,271 ,820|gb|AAF44632.1 |AF224263_2| 


Heterodontus francisci 


6.00E-95 


HoxA9 


Unigenel 15485_AII 


641 


gi 


1 49,633,999| ref |XP_00 1 50942 1 .1 1 


Ornithorhynchus anatinus 


7.00E-72 


HoxC9 


Unigenel 08008_AII 


294 


gi 


220,898,209|gb|ACL81463.1| 


Latimeria menadoensis 


4.00E-34 


HoxD9 


Unigenel 01 797_AII 


242 


gi 


1 1 8,093,579|ref|XP_001 234507.1 1 


Gallus gallus 


3.00E-25 


HoxA10 


Unigene8140_AII 


2019 


gi 


301,607,691 |ref|XP_002933440.1 1 


Xenopus (Silurana) tropicalis 


1.00E-127 


HoxCIO 


Unigenel 1 561 3_AII 


670 


gi 


1 1 ,037,556|gb|AAG27630.1 |AF2981 85_1 1 


Ambystoma mexicanum 


1.00E-101 


HoxDIO 


Unigenel 14687_AII 


531 


gi 


1 94,043,944|ref|XP_001 92501 1 .1 1 


Sus scrofa 


2.00E-75 


HoxC11a 


Unigenel 0971 8_AII 


323 


gi 


30 1 ,6 1 4,392 1 ref | XP_002936692. 1 1 


Xenopus (Silurana) tropicalis 


3.00E-39 


HoxC12 


Unigene86529_AII 


681 


gi 


24,637,1 76|gb|AAN63593.1 |AF434200_1 1 


Pleurodeles walti 


5.00E-68 


HoxA13 


Unigenel 15397_AII 


626 


gi 


220,898,1 76|gb|ACL81 432.1 1 


Latimeria menadoensis 


1.00E-103 


HoxD13 


Unigenel 16723_AII 


1926 


gi 


30 1 ,6 1 2,439| ref |XP_002935 723. 1 1 


Xenopus (Silurana) tropicalis 


1.00E-123 
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which had corresponding contigs from all 3 assemblers 
were mapped to terms in the GO biological process 
database and compared with the whole transcriptome 
background to identify genes involved in important bio- 
logical processes (Figure 4). Among all genes with GO 
assignment annotations, 1848 differentially expressed 
genes between blastema and denervated limb stump librar- 
ies were annotated. Among the differentially expressed 
genes, genes involved in nitrogen and DNA metabolism 
were significantly up-regulated in blastema, such as glucose 
dehydrogenase, GPI mannosyltransferase, and DNA repli- 
cation licensing factor mem. These findings suggest that 
the cell proliferation rate in axolotl blastema is higher than 
in denervated limb stumps, a finding consistent with that 
of a previous report [46]. 

Validation of differentially expressed genes by qPCR 

Among the genes classified as differentially expressed in 
blastema, 47 regeneration-related genes were selected 
based on our interests and their differential expression 
levels were verified by qPCR. Amplification of an endogen- 
ous gene, S21 ribosomal RNA, was used for normalization. 
Among the 39 genes determined by sequence analysis as 



up-regulated in blastema, qPCR results confirmed that 33 
genes had higher expression levels in blastema (Figure 5A). 
Results also confirmed a > 10-fold higher expression 
level for msxl in blastema. Msxl is involved in the 
regulation of dedifferentiation for mature myofibers. 
The morpholino-based suppression of Msxl protein 
expression in myonuclei significantly inhibits the frag- 
mentation and dedifferentiation of salamander myofibers 
[47]. In the initially developed limb bud in mice, msxl 
appears in the entire mesenchyme. However, this expres- 
sion is restricted to the most distal part of the limb bud 
as development proceeds [48]. When msxl -negative 
proximal limb tissues are grafted into the distal portion 
of the limb bud, msxl is re-expressed in the mesenchyme 
of the transplanted tissues, which subsequently become 
dedifferentiated [49]. The homeo box gene hoxdl3 was 
shown to be expressed at higher levels in blastema. In 
mice, the hoxdl3 gene plays a key role in axial skeleton 
development and forelimb morphogenesis [50]. The 
present sequence analysis shows that fgfs were up- 
regulated in blastema, and this pattern was partly con- 
firmed by showing up-regulation of fgf-8b %.n&fgf-9 in 
qPCR analysis. The expression patterns of fgfs in this 
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Figure 4 GO assignment of genes up- or down-regulated in balstema compared with the control non-regenerating limb stump. The 

results of GO biological function assignment annotation of differentially expressed genes are summarized in two main categories: biological 
process and molecular function. 
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Figure 5 qPCR validation of the differentially expressed genes related to regeneration. Thirty nine genes determined as up-regulated 
(A) and 8 genes determined as down-regulated (B) in the blastema by the RPKM values were validated using qPCR. Genes are represented 
by gene names. Bar values represent mean±SE of three independent measurements. 



study are consistent with those reported previously 
[30]. We selected 8 genes classified as down- regulated in 
blastema to be validated by qPCR. The results confirm 
that 5 of the 8 genes had lower expression levels in blas- 
tema. Among them, Wnt-9a and Wnt-2b showed higher 
expression levels in the denervated limb stump than in 
blastema (Figure 5B). 

Expression patterns of differentially expressed genes in 
limb regeneration 

Because of the technical difficulty of completely isolat- 
ing blastema from the underlying stump and epidermis, 
we could not exclude the possibility that the higher 



expression levels of certain genes in blastema might be 
attributed to contaminated neighboring tissues. Thus, 
in situ hybridization may reveal the specific cells ex- 
pressing the genes in this study. The hybridization 
showed that patched-2, one of the genes differentially 
expressed in blastema, was expressed in the cells lo- 
cated in blastema (Figure 6A). Hybridization using the 
sense control probe showed no signal (Figure 6B). The 
signal of patched-2 expression was limited to only a 
portion of blastemal cells, and no signal was shown in 
the epidermis (Figure 6A). These results confirmed 
that patched-2 was differentially expressed in blastema, 
as seen from NGS and qPCR, and that it was expressed 
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Figure 6 In situ hybridization analysis of patched-2 and pax7 expression in the regenerating blastemas. Potched-2 (A) or pax7 (C) signals 
(blue) using respective anti-sense probes are shown in the distal tip of regenerating limbs. No signal was detected in the negative-control 
sections using respective sense probes of patched-2 (B) and pax7 (D) on serially sectioned slides. The insets in the left upper are the enlarged photos of 
respective areas indicated by dotted lines at blastema. Asterisks indicate apical epithelial cap covering the blastema. Scale bars indicate 200 pim. 



in blastemal cells rather than contaminated tissues dur- 
ing tissue sampling. Similarly, pax7 was also shown 
expressed in blastemal cells (Figure 6C). Hybridi- 
zation using the sense control probe showed no signal 
(Figure 6D). Hedgehog signaling, the signals mediated 
by activated membrane patched-1 and 2, is required for 
blastema growth and control of dorsoventral patterning 
[51]. During limb development, hedgehog signaling 
up-regulates fgf4 to promote limb bud proliferation 
[52]. Pax7-positive cells also appear in tail blastemas 
and mid bud-stage limb blastemas [42,53]. Pax7 is re- 
quired for the specification of myogenic satellite cells, 
which are myogenic progenitor cells. These results 
demonstrate that NGS procedures can identify poten- 
tial markers for the blastemal progenitor cells that 
initiate limb regeneration. 

Conclusion 

This study uses Illumina sequencing technology to se- 
quence the axolotl transcriptomes, perform assembly 
and annotation on these sequences, and analyze the dif- 
ferentially expressed genes. More than 116,787 distinct 



sequences were produced with 39,200 sequences below a 
BLAST cut-off threshold of 10" 5 . Along with various 
existing axolotl transcriptome databases [13-15], these 
results can further help researchers investigate axolotls 
limb regeneration. This study also demonstrates the 
feasibility of using multiple assemblers to increase the 
identification accuracy of the differentially expressed genes 
involved in axolotl limb regeneration. 

Methods 

Animal experimental procedures 

Wild-type and white mutant (d/d) axolotls were kept at 
14-20°C in tap water at pH 6.5-7.5. Juvenile axolotls 
with an approximate 80 to 90 mm snout to vent length 
were subjected to denervation at the brachial plexus 
(third to fifth spinal nerves) of the right upper limbs, 
leaving the nerve ends a gap of at least 5 mm. After 1 
wk, amputation was performed at both forearms at the 
mid-radius/ulna. At 14 d post-amputation, the full- 
thickness skin at the tip of limb was gently peeled away 
by cutting through the full-thickness skin around the 
circumference of the limb with spring scissors, and the 
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blastemas in the left forearms and non-regenerating 
stump tissues in the right forearms were collected for 
total RNA extraction. All the surgical experiments were 
conducted under anesthesia with 0.1% MS -222 (Sigma- 
Aldrich, St. Louis, MO, USA). Animal care and experi- 
mental procedures were approved by the Institutional 
Animal Care and Use Committee of National Taiwan 
University College of Medicine. 

RNA extraction and library preparation 

Total RNA was extracted using Trizol Reagent (Invitrogen, 
Carlsbad, CA, USA) and isolated by RNeasy mini-columns 
(Qiagen, Germantown, MD, USA). RNA quality was 
assessed by spectrophotometry. A fragmentation buffer 
(100 mM ZnCl 2 in 100 mM Tris-HCl pH7) was added 
to cut mRNAs into short fragments. Fragmented RNA 
was reverse-transcribed to first-strand cDNA with re- 
verse transcriptase (Invitrogen) in the presence of a 
random hexamer-primer (Invitrogen) and dNTPs for 50 
min at 42°C. The second-strand cDNA was synthesized 
using DNA polymerase I in a buffer containing dNTPs 
and RNaseH. The short DNA fragments were purified 
using the QiaQuick PCR extraction kit (Qiagen) and re- 
solved with an elution buffer (10 mM Tris-Cl, pH 8.5) 
for end reparation and addition of a poly(A) fragment 
to both ends. Thereafter, the short fragments were 
connected with sequencing adapters and then separated 
in gels by electrophoresis. The fragments with a size suit- 
able for NGS were cut from gels and eluted for PCR 
amplification by using adapter primers. 

Analysis of lllumina sequencing results 

Four analytic procedures were conducted on the RNA- 
seq data. 

lllumina sequencing and de novo assembly 

The 2 cDNA libraries were sequenced from both the 5' 
and 3' ends on the lllumina GA II following the manu- 
facturer s instructions. The low-quality raw sequences 
were removed. The remaining short reads with their 
adaptor sequences trimmed were assembled in a de novo 
process using 3 assemblers: Trinity [40], trans- ABySS 
[41], and SOAP de-novo [42]. Similar assembly parame- 
ters were used for the 3 assemblers to compare perform- 
ance. Trinity and SOAP de-novo used the default k-mer 
setting [40,42], and trans- ABySS was run using k-mer 
values of 45, 46, to 89 [41]. All assemblies were performed 
on a server with 24 cores and 128 GB of memory. After as- 
sembly, the contigs longer than 100 bases were used for 
subsequent analysis. 

Functional annotation 

BLASTx [53] was performed to align the assembled 
contigs from the 3 assemblers to the nr protein database 



for functional annotation. The e-value cut-off was set at 
1E-5 for further analysis. Each assembled contig was 
assigned with the gene name and related function based 
on the best BLASTx hit (the smallest e-value). Assembled 
contigs assigned to the same gene were further compared, 
and the contig from the best e-value was adopted. If there 
was a tie between 2 assembled sequences, the one with the 
largest sequence identity was selected. In the end, with re- 
spect to each assembler, a unique contig was used to repre- 
sent a potential gene of axolotl, and this gene was named 
by the corresponding protein in Xenopus laevis. 

Quantization of transcript sequences 

Quantitative analysis was performed using CLCbio (CLC 
Genomics Workbench 4.8, parameter settings: minimum 
length fraction, 0.5; minimum similarity fraction, 0.95; 
maximum number of hits for a read, 10). The reads 
from 2 libraries were mapped to the selected assembled 
contigs for various assemblers. The read counts accumu- 
lated on the selected contigs were further normalized as 
RPKM values. 

Identification of differentially expressed genes 

The fold change of RPKM values (the RPKM is replaced 
with 0.01 if it equals zero) from the libraries of the two tis- 
sue types was calculated to identify differentially expressed 
genes. Because diverse assemblers deliver different sets of 
potential axolotl genes, only those homologous to a Xen- 
opus laevis protein and having corresponding contigs from 
all the 3 assemblers were further analyzed. The genes with 
2-fold up- or down-regulation that were consistent among 
the 3 assemblers were identified. 

qPCR for mRNA quantification 

RNA was prepared using Trizol Reagent (Invitrogen). 
The RNA samples of the blastema tissue and dener- 
vated stump end were harvested from the upper limbs 
14 days after amputation. The total RNA was reverse- 
transcribed to first-strand cDNA with reverse transcriptase 
(Invitrogen) in the presence of a random hexamer-primer 
(Invitrogen) and dNTPs for 50 min at 42°C. The expres- 
sion levels of specific mRNAs were determined by qPCR 
using respective primer pairs (Additional file 5). Each reac- 
tion was conducted in a total volume of 20 [tL containing 
50 ng first-strand cDNA, 10 2X Fast SYBR Green Mas- 
ter Mix (Applied Biosystems, Carlsbad, CA, USA), and 
each primer pair at 0.5 [*M. A Step One™ Real-Time PCR 
system (Applied Biosystems) was used. Data was analyzed 
using Step One™ software version 2.1 (Applied Biosystems). 
Error bars indicate RQ max and RQ min. 

In situ hybridization 

Templates of cDNA for axolotl patched- 2(319 bp) and 
pax7 (303 bp) were amplified by RT-PCR using respective 
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sense primers linked to SP6 promoter sequence and anti- 
sense primers linked to T7 promoter sequence. Sense and 
anti-sense RNA probes were synthesized from the cDNA 
templates using a digoxigenin RNA labeling kit fol- 
lowing the manufacturers protocol (Roche, Indianapolis, 
IN, USA). RNA probes were prepared using respect- 
ive primers: patched-2 (anti-sense), 5' CGATTTAGG 
TGACACTATAGAAGAGTCCAACGACGTGAGGACA 
AGA- 3'; patched-2 (sense), 5'- CGTAATACGACTC 
ACTATAGGGAGATTGAGCTGGATGGCGTGAA-3'; 
pax7 (anti-sense), 5'-CGATTTAGGTGACACTATAG 
AAGAGAAACAGGCAGGAGCCAATCAA-3'; and pax7 
(sense), 5'-CGTAATACGACTCACTATAGGGAGAGCT 
GACCGGATTCATGTGGTT-3'. 

Blastema tissues were fixed overnight at 4°C in 4% 
paraformaldehyde in lx phosphate buffered saline (PBS) 
and subsequently embedded in Tissue-Tek (Thermo Sci- 
entific, Miami, FL, USA) and frozen at -80°C until sec- 
tioning at 10 /mi. Before hybridization, sections were 
digested with 1 /ug/mL proteinase K in lx PBS at room 
temperature for 8 min, fixed again in 4% paraformalde- 
hyde in lx PBS at room temperature for 20 min, and 
carbonated with diethypyrocarbonate in lx PBS. The 
slides were covered with Hybri-well Press-seal hybridi- 
zation chambers ( Sigma- Aldrich) and hybridized over- 
night at 58°C with pre-denatured DIG-labeled probes in 
a hybridization solution (Roche). After hybridization, the 
slides were washed in 5x SSC twice, each for 1 h, at 65°C, 
and then in O.lx SSC for 30 min at room temperature. 
The slides were rinsed in a buffer containing 100 mM 
Tris-HCl (pH 7.5) and 150 mM NaCl. The slides were 
incubated at 4°C overnight in the same buffer containing 
an alkaline phosphatase-conjugated anti-digoxigenin 
antibody (Roche) diluted at 1:1000. The slides were 
washed in PBST (0.1% TritonX-100 in PBS) 3 times for 
30 min each at room temperature. The signals of alka- 
line phosphatase activities bound on the anti-DIG-anti- 
body were detected using a mixture of the BCIP/NBT 
solution (Sigma- Aldrich). 

Additional files 



genes able to be annotated, which are based for comparison in the 
Venn diagrams. 

Additional file 5: Primer pairs used in qPCR for validating the 
differentially expressed genes. 
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